A Quadtree-Based Lightweight Data Compression Approach to Processing Large-Scale Geospatial Rasters

Authors

  • Jianting Zhang
  • Simin You
Abstract

Huge volumes of geospatial rasters, such as remotely sensed imagery and environmental modeling output, are being generated at increasingly fine spatial, temporal, spectral and thematic resolutions. Given that CPUs on modern computer systems are three orders of magnitude faster than disk I/O and two orders of magnitude faster than network bandwidth, it is increasingly beneficial to spend computing power on real-time compression and decompression in order to reduce I/O times when streaming large-scale geospatial rasters among CPU memory, disks, distributed computing nodes and file systems. In this study, we aim to develop a lightweight lossless data compression technique for large-scale geospatial rasters that balances compression and decompression performance. Our Bitplane bitmap Quadtree (BQ-Tree) based technique encodes the bitmaps of raster bitplanes as compact quadtrees, which compress and index rasters simultaneously. The technique is simple by design and lightweight in implementation: apart from computing Z-order codes for cache efficiency, only bit-level operations are required. Extensive experiments using 36 rasters of the NASA Shuttle Radar Topography Mission (SRTM) 30-meter resolution elevation data, with 20 billion raster cells in total, show that our BQ-Tree technique is more than 4X faster for compression and 36% faster for decompression than zlib on a single CPU core, while achieving very similar compression ratios. With 16 CPU cores, our technique further achieves 10-13X speedups for compression and 4X speedups for decompression. On the experiment machine, equipped with dual Intel Xeon 8-core E5-2650V2 CPUs, our technique compresses the SRTM raster dataset in about 35 seconds and decompresses it in 29 seconds. This performance compares favorably with the best known technique in both compression and decompression throughput. We have made the source code package and a subset of the SRTM rasters publicly available to facilitate validation and cross-comparison.
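The abstract does not describe the BQ-Tree encoding in detail, so the following Python sketch only illustrates the general idea under stated assumptions: a small square raster whose side is a power of two, a three-symbol node code (0 = all zeros, 1 = all ones, 2 = mixed), and children visited in Z-order. The function names `bitplanes`, `morton`, `encode_quadtree` and `decode_quadtree` are hypothetical and are not taken from the authors' source code package.

```python
from typing import List

Bitmap = List[List[int]]  # square grid of 0/1 values, side a power of two

# Child quadrants in Z-order: top-left, top-right, bottom-left, bottom-right.
QUADRANTS = ((0, 0), (0, 1), (1, 0), (1, 1))


def bitplanes(raster: List[List[int]], nbits: int) -> List[Bitmap]:
    """Split an integer raster into nbits bitmaps, most significant bit first."""
    return [[[(v >> b) & 1 for v in row] for row in raster]
            for b in range(nbits - 1, -1, -1)]


def morton(row: int, col: int, nbits: int = 16) -> int:
    """Interleave the bits of row and col into a Z-order (Morton) code."""
    code = 0
    for i in range(nbits):
        code |= ((col >> i) & 1) << (2 * i)      # column bits on even positions
        code |= ((row >> i) & 1) << (2 * i + 1)  # row bits on odd positions
    return code


def encode_quadtree(bm: Bitmap, r0: int = 0, c0: int = 0, size: int = 0) -> List[int]:
    """Pre-order node codes: 0 = all zeros, 1 = all ones, 2 = mixed (assumed scheme)."""
    size = size or len(bm)
    first = bm[r0][c0]
    if all(bm[r][c] == first
           for r in range(r0, r0 + size) for c in range(c0, c0 + size)):
        return [first]  # uniform quadrant collapses into a single leaf
    half = size // 2
    codes = [2]
    for dr, dc in QUADRANTS:
        codes += encode_quadtree(bm, r0 + dr * half, c0 + dc * half, half)
    return codes


def decode_quadtree(codes: List[int], size: int) -> Bitmap:
    """Rebuild the bitmap from the pre-order node codes."""
    bm = [[0] * size for _ in range(size)]
    pos = 0

    def fill(r0: int, c0: int, s: int) -> None:
        nonlocal pos
        code = codes[pos]
        pos += 1
        if code == 2:  # mixed node: recurse into the four children
            half = s // 2
            for dr, dc in QUADRANTS:
                fill(r0 + dr * half, c0 + dc * half, half)
        else:          # leaf: paint the whole quadrant with 0 or 1
            for r in range(r0, r0 + s):
                for c in range(c0, c0 + s):
                    bm[r][c] = code

    fill(0, 0, size)
    return bm


# Toy 4x4 raster with 2-bit cell values (illustrative data, not SRTM).
raster = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [2, 2, 3, 3],
          [2, 2, 3, 2]]
planes = bitplanes(raster, nbits=2)
trees = [encode_quadtree(p) for p in planes]
restored = [decode_quadtree(t, size=4) for t in trees]
```

In this sketch, large uniform regions of a bitplane collapse into single leaves, which is where the compression comes from, and the same tree can be walked to locate a quadrant without decoding the whole plane. A practical implementation would pack the node codes into bits rather than keep them in a Python list.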

Similar resources

Simplifying High-Performance Geospatial Computing on GPGPUs Using Parallel Primitives: A Case Study of Quadtree Constructions on Large-Scale Geospatial Rasters

The increasingly available Graphics Processing Units (GPU) hardware resources and the emerging General Purpose computing on GPU (GPGPU) technologies provide an alternative and complementary solution to existing cluster based high-performance geospatial computing. However, the complexities of the unique GPGPU hardware architectures and the steep learning curve of GPGPU programming have imposed s...

High-performance quadtree constructions on large-scale geospatial rasters using GPGPU parallel primitives

The increasingly available Graphics Processing Units (GPU) hardware and the emerging General Purpose computing on GPU (GPGPU) technologies provide an attractive solution to high-performance geospatial computing. In this study, we have proposed a parallel primitive based approach to quadtree construction by transforming a multidimensional geospatial computing problem into chaining a set of gener...

A Multilevel Parallel and Scalable Single-Host GPU Cluster Framework for Large-Scale Geospatial Data Processing

Grant J. Scott and Kirk Backus, University of Missouri Center for Geospatial Intelligence, Columbia, Missouri, USA

Geospatial data exists in a variety of formats, including rasters, vector data, and large-scale geospatial databases. There exists an ever-growing number of sensors that are collecting this data, resulting in the explosive growth and scale of high-resolution remote sensing geospatial data collections. A particularly challenging domain of geospatial data processing involves mining information fr...

Supporting Web-Based Visual Exploration of Large-Scale Raster Geospatial Data Using Binned Min-Max Quadtree

Traditionally environmental scientists are limited to simple display and animation of large-scale raster geospatial data derived from remote sensing instrumentation and model simulation outputs. Identifying regions that satisfy certain range criteria, e.g., temperature between [t1,t2) and precipitation between [p1,p2), plays an important role in query-driven visualization and visual exploration...

Parallel Quadtree Coding of Large-Scale Raster Geospatial Data on Multicore CPUs and GPGPUs

Global remote sensing and large-scale environmental modeling have generated huge amounts of raster geospatial data. While the inherent data parallelism of large-scale raster geospatial data allows straightforward coarse-grained parallelization at the chunk level on CPUs, it is largely unclear how to effectively exploit such data parallelism on massively parallel General Purpose Graphics Process...

Publication date: 2014